N=1:

rank | frequency | n-gram |
---|---|---|
1 | 108431 | -n |
2 | 79776 | -e |
3 | 55218 | -s |
4 | 45252 | -r |
5 | 37480 | -t |

N=2:

rank | frequency | n-gram |
---|---|---|
1 | 77112 | -en |
2 | 36468 | -er |
3 | 18191 | -ng |
4 | 13930 | -es |
5 | 13696 | -te |

N=3:

rank | frequency | n-gram |
---|---|---|
1 | 15499 | -ten |
2 | 14152 | -ung |
3 | 11987 | -gen |
4 | 9195 | -hen |
5 | 7297 | -che |

N=4:

rank | frequency | n-gram |
---|---|---|
1 | 8112 | -chen |
2 | 6348 | -ngen |
3 | 4006 | -sche |
4 | 3676 | -tion |
5 | 3326 | -cher |

N=5:

rank | frequency | n-gram |
---|---|---|
1 | 4881 | -ungen |
2 | 4617 | -schen |
3 | 3449 | -ische |
4 | 2085 | -erung |
5 | 1986 | -ation |

The tables show the most frequent letter N-grams at the ends of words for N=1…5. The procedure parallels 2.2.5 Most frequent word beginnings; here the aim is suffix detection rather than prefix detection.
For N=3 (for other values of N, change the length argument in both RIGHT calls):
SELECT @pos := (@pos + 1), xx.* FROM (SELECT @pos := 0) r, (SELECT COUNT(*) AS cnt, CONCAT("-", RIGHT(word, 3)) FROM words WHERE w_id > 100 GROUP BY RIGHT(word, 3) ORDER BY cnt DESC) xx LIMIT 5;
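For readers without access to the database, the same kind of count can be sketched in Python. This is only a minimal sketch, not the procedure used to produce the tables: the function name `top_endings` and the toy word list are hypothetical, each list entry is assumed to be one distinct word form (one row of `words`), and ties are broken alphabetically here, whereas the SQL statement leaves the order of ties unspecified.

```python
from collections import Counter


def top_endings(words, n, k=5):
    """Return the k most frequent word-final letter n-grams.

    Each result row is (rank, frequency, ending), with the ending
    prefixed by "-" as in the tables.
    """
    # word[-n:] behaves like SQL RIGHT(word, n): a word shorter than n
    # contributes the whole word. Empty strings are skipped.
    counts = Counter(word[-n:] for word in words if word)
    # Sort by descending frequency; break ties alphabetically (assumption).
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [
        (rank, freq, "-" + ending)
        for rank, (ending, freq) in enumerate(ranked[:k], start=1)
    ]


if __name__ == "__main__":
    # Hypothetical toy word list, not taken from the corpus.
    sample = ["Zeitungen", "Meinungen", "Menschen", "machen", "Rechnung"]
    for row in top_endings(sample, n=3, k=3):
        print(*row, sep=" | ")
```

Unlike a database with a case-insensitive collation, this sketch treats upper- and lower-case letters as different; lowercasing the words first would remove that difference.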
2.2.5 Most frequent word beginnings